Project: Personal Loan Campaign Modelling

Background and Context:

AllLife Bank is a US bank that has a growing customer base. The majority of these customers are liability customers (depositors) with varying sizes of deposits. The number of customers who are also borrowers (asset customers) is quite small, and the bank is interested in expanding this base rapidly to bring in more loan business and in the process, earn more through the interest on loans. In particular, the management wants to explore ways of converting its liability customers to personal loan customers (while retaining them as depositors).

A campaign that the bank ran last year for liability customers showed a healthy conversion rate of over 9%. This has encouraged the retail marketing department to devise campaigns with better target marketing to increase the success ratio.

As a data scientist at AllLife Bank, you have to build a model that helps the marketing department identify the potential customers who have a higher probability of purchasing the loan.

Objective:

Data Dictionary:

Import the necessary packages

Read the dataset

View the first and last 10 rows of the dataset.

Check the data types of the columns for the dataset.

Check for missing values
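A minimal sketch of this check, assuming the data is loaded into a pandas DataFrame. The toy frame below stands in for the real dataset; its values are made up for illustration, and only the column names are borrowed from the predictors mentioned in the conclusions.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the loan dataset (values are illustrative only).
df = pd.DataFrame({
    "CCAvg": [1.6, 1.5, np.nan, 2.7],
    "Mortgage": [0, 0, 155, np.nan],
    "Family": [4, 3, 1, 1],
})

# Count missing entries per column, then check whether any exist at all.
missing_per_column = df.isnull().sum()
has_missing = bool(df.isnull().values.any())

print(missing_per_column)
print("Any missing values:", has_missing)
```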

Give a statistical summary for the dataset

Observations

EDA

Univariate analysis

Bivariate Analysis

Observations

Univariate Analysis

Bivariate Analysis

Data Preparation

Model Building

Observations

Model performance evaluation and improvement

Split Data
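One way to do the split is sketched below. It assumes a 70/30 split and stratification on the target, neither of which is stated in this notebook, and it uses synthetic data with roughly 9% positives (echoing the conversion rate in the background) in place of the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the loan data: about 9% of rows are positives.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.91, 0.09], flip_y=0, random_state=1
)

# Assumed 70/30 split; stratify=y keeps the positive rate similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=1
)

print("Train shape:", X_train.shape, "Test shape:", X_test.shape)
print("Positive rate (train, test):", y_train.mean(), y_test.mean())
```

Stratifying matters here because the positive class is small; an unstratified split could leave the test set with too few borrowers to measure recall reliably.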

Build Decision Tree Model
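A sketch of a baseline tree, assuming scikit-learn's `DecisionTreeClassifier` with default settings and recall as the metric of interest (the comments later in this notebook compare models on recall). The data is synthetic, so the scores it prints say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.91, 0.09], flip_y=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=1
)

# Unrestricted tree: it grows until every leaf is pure.
tree = DecisionTreeClassifier(criterion="gini", random_state=1)
tree.fit(X_train, y_train)

train_recall = recall_score(y_train, tree.predict(X_train))
test_recall = recall_score(y_test, tree.predict(X_test))
print("Recall (train):", train_recall)
print("Recall (test):", test_recall)
```

A fully grown tree typically scores perfectly on the training set, so a gap between the two recall values is the usual sign of overfitting.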

Comments

Visualizing the Decision Tree

Text report showing the rules of the decision tree
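Such a report can be produced with scikit-learn's `export_text`. The sketch below fits a shallow tree on synthetic data; the feature names are placeholders borrowed from the predictors named in the conclusions, not the real column order.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data with four features; the names below are placeholders.
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=2, n_redundant=0, random_state=1
)
feature_names = ["CCAvg", "Family", "Mortgage", "Education"]

# A shallow tree keeps the printed rules short.
tree = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X, y)

rules = export_text(tree, feature_names=feature_names)
print(rules)
```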

Reducing overfitting

Using GridSearch for Hyperparameter tuning of our tree model
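A minimal sketch of the tuning step with `GridSearchCV`, scored on recall. The parameter grid, the 5-fold cross-validation, and the synthetic data are all assumptions made for illustration; the grid actually used in this notebook is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.91, 0.09], flip_y=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=1
)

# Hypothetical grid: limits on depth and leaf size act as pre-pruning.
param_grid = {
    "max_depth": [2, 4, 6, None],
    "min_samples_leaf": [1, 5, 10],
    "criterion": ["gini", "entropy"],
}

grid = GridSearchCV(
    DecisionTreeClassifier(random_state=1), param_grid, scoring="recall", cv=5
)
grid.fit(X_train, y_train)

best_tree = grid.best_estimator_  # refit on the full training set
print("Best parameters:", grid.best_params_)
print("Best cross-validated recall:", grid.best_score_)
```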

Comments

Visualizing the New Decision Tree

Cost Complexity Pruning
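Cost complexity pruning can be sketched as follows: compute the candidate `ccp_alpha` values with `cost_complexity_pruning_path`, fit one tree per value, and compare recall. The data is synthetic, and picking the alpha on the held-out set is a simplification; a separate validation set or cross-validation would be a cleaner choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.91, 0.09], flip_y=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=1
)

# Effective alphas at which successive subtrees get pruned away.
path = DecisionTreeClassifier(random_state=1).cost_complexity_pruning_path(
    X_train, y_train
)
# Guard against tiny negative values caused by floating-point error.
ccp_alphas = np.clip(path.ccp_alphas, 0, None)

# Fit one tree per alpha and record recall on the held-out data.
trees, recalls = [], []
for alpha in ccp_alphas:
    clf = DecisionTreeClassifier(ccp_alpha=alpha, random_state=1)
    clf.fit(X_train, y_train)
    trees.append(clf)
    recalls.append(recall_score(y_test, clf.predict(X_test)))

best_index = int(np.argmax(recalls))
print("Candidate alphas:", len(ccp_alphas))
print("Alpha with highest recall:", ccp_alphas[best_index])
```

Larger alphas prune more aggressively, so the last candidate usually collapses the tree to a single node.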

Comments

Visualizing the Decision Tree

Comments

Creating a model with ccp_alpha = 0.0032
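A sketch of fitting a tree with this `ccp_alpha` and comparing its size with an unpruned tree. The value 0.0032 comes from this notebook; the data here is synthetic, so the node counts and recall will differ from the real results.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.91, 0.09], flip_y=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=1
)

unpruned = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)
pruned = DecisionTreeClassifier(ccp_alpha=0.0032, random_state=1).fit(
    X_train, y_train
)

pruned_recall = recall_score(y_test, pruned.predict(X_test))
print("Nodes (unpruned, pruned):", unpruned.tree_.node_count, pruned.tree_.node_count)
print("Test recall (pruned):", pruned_recall)
```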

Visualizing the Decision Tree

Comparing all the decision tree models

Comments

I think my recall score was best for the decision tree with hyperparameter tuning, and I shouldn't have done post-pruning or built the best model using ccp_alpha.

Conclusions/Recommendations

The model indicates that the most significant predictors for a customer are
* This means the bank should target customers who have a mid-range CCAvg, live in a decent area, have a CD account, hold an Advanced/Professional education, and have work experience, a mortgage and a family size of 4.